A Self-Training Subspace Clustering

Authors

  • Chun-Qiu Xia
  • Ke Han
  • Yong Qi
  • Yang Zhang
  • Dong-Jun Yu

Abstract

Accurate identification of cancer types is essential to cancer diagnosis and treatment. Because cancerous and normal tissues differ in gene expression, gene expression data can serve as an efficient feature source for cancer classification. However, accurate cancer classification directly from the original gene expression profiles remains challenging because of the intrinsically high dimensionality of the features and the small number of samples. We propose a new self-training subspace clustering algorithm under low-rank representation, called SSC-LRR, for cancer classification on gene expression data. Low-rank representation (LRR) is first applied to extract discriminative features from the high-dimensional gene expression data; a self-training subspace clustering (SSC) method is then used to generate the cancer classification predictions. SSC-LRR was tested on two separate benchmark datasets in comparison with four state-of-the-art classification methods. It generated cancer classification predictions with an overall accuracy of 89.7% and a general correlation of 0.920, which are 18.9% and 24.4% higher, respectively, than those of the best comparison method. In addition, several genes (RNF114, HLA-DRB5, USP9Y and PTPN20) were identified by SSC-LRR as new cancer identifiers that deserve further clinical investigation. Overall, the study demonstrates a new, sensitive avenue for recognizing cancer classifications from large-scale gene expression data.
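The low-rank representation step mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' SSC-LRR implementation: it assumes the noise-free LRR problem (minimize the nuclear norm of Z subject to X = XZ), whose minimizer is known to be V Vᵀ, with V taken from the skinny SVD of X. The function name `lrr_noiseless`, the tolerance, and the synthetic toy data are all hypothetical choices made for illustration; the self-training stage is not shown.

```python
import numpy as np


def lrr_noiseless(X, tol=1e-10):
    """Closed-form minimizer of ||Z||_* subject to X = X Z (noise-free case only).

    X has shape (n_features, n_samples); the result is (n_samples, n_samples).
    Assumption: real gene expression data are noisy and would need a
    regularized LRR solver, which is not sketched here.
    """
    _, s, vt = np.linalg.svd(X, full_matrices=False)
    rank = int(np.sum(s > tol * s.max()))  # numerical rank of X
    v = vt[:rank].T                        # right singular vectors as columns
    return v @ v.T


# Synthetic toy data (hypothetical): 5 samples from each of two random
# 2-dimensional subspaces of a 20-dimensional feature space.
rng = np.random.default_rng(0)
basis_a = rng.standard_normal((20, 2))
basis_b = rng.standard_normal((20, 2))
X = np.hstack([
    basis_a @ rng.standard_normal((2, 5)),
    basis_b @ rng.standard_normal((2, 5)),
])

Z = lrr_noiseless(X)
# Symmetric affinity that a downstream clustering step could consume.
affinity = np.abs(Z) + np.abs(Z.T)
```

For independent subspaces, as in this toy example, Z is block-diagonal up to floating-point error, so samples from different subspaces receive near-zero affinity. That structure is what makes the representation useful as a feature source for a later clustering or classification stage.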

Related resources

Deep Subspace Clustering Networks

We present a novel deep neural network architecture for unsupervised subspace clustering. This architecture is built upon deep auto-encoders, which non-linearly map the input data into a latent space. Our key idea is to introduce a novel self-expressive layer between the encoder and the decoder to mimic the “self-expressiveness” property that has proven effective in traditional subspace clusteri...

A Shift Tolerant Dictionary Training Method

Traditional dictionary learning methods work by vectorizing long signals and training on frames of the data, thereby restricting the learning to time-localized atoms. We study a shift-tolerant approach to learning dictionaries, whereby the features are learned by training on shifted versions of the signal of interest. We propose an optimized Subspace Clustering learning method to accommodat...

Learning Transformations for Clustering and Classification

A low-rank transformation learning framework for subspace clustering and classification is proposed here. Many high-dimensional data, such as face images and motion sequences, approximately lie in a union of low-dimensional subspaces. The corresponding subspace clustering problem has been extensively studied in the literature to partition such high-dimensional data into clusters corresponding to...

Subspace clustering for complex data

Clustering is an established data mining technique for grouping objects based on their mutual similarity. Since in today’s applications, however, usually many characteristics for each object are recorded, one cannot expect to find similar objects by considering all attributes together. In contrast, valuable clusters are hidden in subspace projections of the data. As a general solution to this p...

Publication date: 2017